Fuzzy optimality relation for perceptive MDPs - the average case
نویسندگان
چکیده
This paper is a sequel to Kurano et al [9], [10], in which the fuzzy perceptive models for optimal stopping or discounted Markov decision process is given. We proposed a method of computing the corresponding fuzzy perceptive values. Here, we deal with the average case for Markov decision processes with fuzzy perceptive transition matrices and characterize the optimal average expected reward, called the average perceptive value, by a fuzzy optimality relation. Also, we give a numerical example.
منابع مشابه
Fuzzy Optimality Equations for Perceptive MDPs
This paper is a sequel to Kurano et al [9], [10], in which the fuzzy perceptive models for optimal stopping or discounted Markov decision process are proposed and the methods of computing the corresponding fuzzy perceptive values are given. Here, we deal with the average case for Markov decisin processes with fuzzy perceptive transition matrices and characterize the optimal average expected rew...
متن کاملFuzzy Perceptive Values for MDPs with Discounting
In this paper, we formulate the fuzzy perceptive model for discounted Markov decision processes in which the perception for transition probabilities is described by fuzzy sets. The optimal expected reward, called a fuzzy perceptive value, is characterized and calculated by a new fuzzy relation. As a numerical example, a machine maintenance problem is considered.
متن کاملPerceptive Evaluation for the Optimal Discounted Reward in Markov Decision Processes
We formulate a fuzzy perceptive model for Markov decision processes with discounted payoff in which the perception for transition probabilities is described by fuzzy sets. Our aim is to evaluate the optimal expected reward, which is called a fuzzy perceptive value, based on the perceptive analysis. It is characterized and calculated by a certain fuzzy relation. A machine maintenance problem is ...
متن کاملA Generalized Reinforcement-Learning Model: Convergence and Applicationa
Reinforcement learning is the process by which an autonomous agent uses its experience interacting with an environment to improve its behavior. The Markov decision process (mdp) model is a popular way of formalizing the reinforcement-learning problem, but it is by no means the only way. In this paper, we show how many of the important theoretical results concerning reinforcement learning in mdp...
متن کاملOn the Reduction of Total-Cost and Average-Cost MDPs to Discounted MDPs
This paper provides conditions under which total-cost and average-cost Markov decision processes (MDPs) can be reduced to discounted ones. Results are given for transient total-cost MDPs with transition rates whose values may be greater than one, as well as for average-cost MDPs with transition probabilities satisfying the condition that there is a state such that the expected time to reach it ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Fuzzy Sets and Systems
دوره 158 شماره
صفحات -
تاریخ انتشار 2007